[Feature](compaction) add CompactionTaskTracker with system table and HTTP API #61696
Draft
Yukang-Lian wants to merge 2 commits into apache:master
… HTTP API

Introduce CompactionTaskTracker to provide full lifecycle observability for base/cumulative/full compaction tasks, covering PENDING, RUNNING, FINISHED and FAILED states.

Key changes:
- Add CompactionTaskTracker singleton with push-based data collection and pull-based querying (system table + HTTP API)
- Add information_schema.be_compaction_tasks system table (38 columns) with multi-BE fan-out via BackendPartitionedSchemaScanNode
- Add GET /api/compaction/profile HTTP API with tablet_id/top_n/compact_type/success filters
- Integrate tracker at all compaction entry points (local background, cloud background, manual HTTP trigger, load-triggered)
- Track vertical compaction progress (total_groups/completed_groups) via callback through Merger::vertical_merge_rowsets
- Support three trigger methods: AUTO/MANUAL/LOAD_TRIGGERED
- Add enable_compaction_task_tracker switch and configurable max_records (default 10000, ~5MB memory)
- Add 15 unit tests and 2 regression tests
Summary
Introduce CompactionTaskTracker to provide full lifecycle observability for base/cumulative/full compaction tasks. This PR adds two query channels:
- information_schema.be_compaction_tasks system table (38 columns): query via SQL across all BEs
- GET /api/compaction/profile HTTP API: query via JSON on a single BE, with filtering support

What problem does this PR solve?
Currently, compaction execution metrics (input/output data sizes, row counts, merge latency, etc.) are lost once the compaction object is destructed. Operators have no way to inspect historical compaction performance — only the current running status is available. Users also cannot see pending or running compaction task details through SQL.
This PR solves both problems with a unified CompactionTaskTracker mechanism that tracks compaction tasks across their full lifecycle: PENDING → RUNNING → FINISHED/FAILED.

Architecture
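The lifecycle can be illustrated with a minimal Python sketch. This is not the BE's C++ implementation: the class and method names (`TrackerSketch`, `register`, `start`, `complete`, `snapshot`) are hypothetical, and the sketch assumes that completed records live in a bounded buffer that evicts the oldest entry once a `max_records`-style cap is reached.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from threading import Lock
from typing import Optional


class Status(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


@dataclass
class TaskRecord:
    compaction_id: int
    tablet_id: int
    compaction_type: str  # "base", "cumulative" or "full"
    status: Status = Status.PENDING
    status_msg: Optional[str] = None


class TrackerSketch:
    """Hypothetical stand-in for a push-based tracker with pull-based reads."""

    def __init__(self, max_records: int = 10000):
        self._lock = Lock()
        self._active = {}  # compaction_id -> TaskRecord (PENDING or RUNNING)
        # Assumption: the oldest completed record is dropped when the cap is hit.
        self._completed = deque(maxlen=max_records)

    def register(self, compaction_id: int, tablet_id: int, compaction_type: str) -> None:
        """Push: a task was scheduled and is now PENDING."""
        with self._lock:
            self._active[compaction_id] = TaskRecord(compaction_id, tablet_id, compaction_type)

    def start(self, compaction_id: int) -> None:
        """Push: PENDING -> RUNNING."""
        with self._lock:
            record = self._active[compaction_id]
            if record.status is not Status.PENDING:
                raise ValueError(f"task {compaction_id} is not PENDING")
            record.status = Status.RUNNING

    def complete(self, compaction_id: int, error: Optional[str] = None) -> None:
        """Push: move the task to FINISHED, or to FAILED when an error is given."""
        with self._lock:
            record = self._active.pop(compaction_id)
            record.status = Status.FAILED if error else Status.FINISHED
            record.status_msg = error
            self._completed.append(record)

    def snapshot(self) -> list:
        """Pull: active tasks first, then completed ones, oldest to newest."""
        with self._lock:
            return list(self._active.values()) + list(self._completed)
```

In this sketch, writers only touch the tracker at state transitions, while readers take a copy under the same lock, so a query never observes a half-updated record.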
System Table: information_schema.be_compaction_tasks

Schema (38 columns)

- Compaction type: base/cumulative/full
- Status: PENDING/RUNNING/FINISHED/FAILED
- Trigger method: AUTO/MANUAL/LOAD_TRIGGERED
- Elapsed time: now - start_time; FINISHED/FAILED: end - start
- Version range format: [0-5]

SQL Query Examples
HTTP API: GET /api/compaction/profile

Queries completed compaction profiles on a single BE with optional filtering. All parameters can be combined (AND logic).
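As a hedged illustration of how the filters combine, the Python sketch below builds request URLs for this endpoint. The helper name `profile_url` is hypothetical, and the default port 8040 is an assumption about the BE web server port; adjust it to your deployment. Only the four parameters named in this PR (tablet_id, top_n, compact_type, success) are accepted.

```python
from urllib.parse import urlencode

# Filter names taken from the PR description, in a fixed output order.
ALLOWED_FILTERS = ("tablet_id", "top_n", "compact_type", "success")


def profile_url(be_host: str, http_port: int = 8040, **filters) -> str:
    """Build a GET /api/compaction/profile URL (hypothetical helper).

    http_port=8040 is an assumed default, not something this PR states.
    """
    unknown = set(filters) - set(ALLOWED_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")

    params = {}
    for name in ALLOWED_FILTERS:
        value = filters.get(name)
        if value is None:
            continue  # omitted filters do not constrain the result
        # Booleans are sent as lowercase true/false, as in the parameter list.
        params[name] = str(value).lower() if isinstance(value, bool) else value

    base = f"http://{be_host}:{http_port}/api/compaction/profile"
    return f"{base}?{urlencode(params)}" if params else base


if __name__ == "__main__":
    # Failed cumulative compactions of one tablet, limited by top_n.
    print(profile_url("127.0.0.1", tablet_id=12345, compact_type="cumulative",
                      success=False, top_n=10))
```

Because every supplied parameter narrows the result, leaving a filter out is the way to widen the query; the helper therefore skips `None` values rather than sending empty strings.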
Request Parameters

- tablet_id
- top_n
- compact_type: base/cumulative/full
- success: true/false

Response Example
{
  "status": "Success",
  "compaction_profiles": [
    {
      "compaction_id": 487,
      "compaction_type": "cumulative",
      "tablet_id": 12345,
      "table_id": 67890,
      "partition_id": 11111,
      "trigger_method": "AUTO",
      "compaction_score": 10,
      "scheduled_time": "2025-07-15 14:02:30",
      "start_time": "2025-07-15 14:02:31",
      "end_time": "2025-07-15 14:02:31",
      "cost_time_ms": 236,
      "success": true,
      "input_rowsets_count": 5,
      "input_row_num": 52000,
      "input_data_size": 10706329,
      "input_index_size": 204800,
      "input_total_size": 10911129,
      "input_segments_num": 5,
      "input_version_range": "[12-16]",
      "merged_rows": 1200,
      "filtered_rows": 50,
      "output_rows": 50750,
      "output_row_num": 50750,
      "output_data_size": 5033164,
      "output_index_size": 102400,
      "output_total_size": 5135564,
      "output_segments_num": 1,
      "output_version": "[12-16]",
      "merge_latency_ms": 180,
      "bytes_read_from_local": 10706329,
      "bytes_read_from_remote": 0,
      "peak_memory_bytes": 33554432,
      "is_vertical": true,
      "permits": 10706329,
      "vertical_total_groups": 4,
      "vertical_completed_groups": 4
    }
  ]
}

Failed compactions additionally include "status_msg": "error message".

curl Examples
Configuration

- enable_compaction_task_tracker: default true
- compaction_task_tracker_max_records: default 10000

Scope
Test plan
- Unit tests (CompactionTaskTrackerTest.*) covering full lifecycle, failure paths, TRY_LOCK_FAILED cleanup, vertical progress, concurrent safety, config switch, filters
- Regression test: test_be_compaction_tasks (system table)
- Regression test: test_compaction_profile_action (HTTP API)